Abstract

Thirty students (73% female, M = 21 years) reporting significant distress, low self-esteem, and 
depressive symptoms were randomly assigned to three sessions of either: (a) restructuring of negative 
self-thoughts (via training and daily practice using the Thought Record) or (b) enhancement of positive 
self-statements (via fluency training and daily flashcard rehearsal). Both methods were associated with 
clinically significant improvement that persisted at follow-up. Using existing studies as benchmarks, this 
improvement met or exceeded that of related treatment conditions and clearly exceeded that of control 
conditions. Results suggest both disputation of negative and enhancement of positive self-thoughts can be 
beneficial. 

Negative self-evaluation (i.e., low self-esteem) is theoretically and empirically associated with a 
range of psychological difficulties (e.g., eating disorders, social anxiety), but has been especially linked to 
depressive symptoms. Central to Beck and colleagues’ (1979) influential account of depression is the 
negative cognitive triad, which suggests that depressed individuals have a negative view of the self, 
world, and future. As Fennell (2004, p. 1058) summarizes: “Beck’s cognitive model identifies... negative 
thoughts about the self as central to the development and maintenance of depression.” Empirical findings 
support a significant link between negative self-statements, low self-esteem, and depression (Hollon & 
Kendall, 1980; Lewinsohn, Seeley, & Gotlib, 1997; Osman et al., 1997; Roberts, Gotlib, & Kassel, 1996; 
Smith & Betz, 2002) and the efficacy of cognitive-behavioral therapy as an intervention for depression 
has been established in large-scale clinical trials and meta-analyses (DeRubeis et al., 2005; Dobson, 1989; 
Gloaguen, Cottraux, Cucherat, & Blackburn, 1998). According to cognitive theory, negative 
self-statements result from maladaptive schemata that bias processing of the information taken in from the 
environment. Correction of these depressive schemata, via training in and practice of cognitive 
restructuring techniques, is hypothesized to be the critical ingredient of successful therapy (Beck et al., 
1979). The presence and influence of, and changes in, these schemata are not directly observed but are 
inferred from observations of negative self-statements, which the client verbalizes in interaction with the 
therapist or endorses on self-report measures. 

Behavior analysts reject explanations that require reference to hypothetical schemata that are in 
principle unobservable. However, clinical behavior analysts do not deny the high prevalence of negative 
self-statements among individuals described as depressed or having low self-esteem. Moreover, because 
an individual can serve both as a speaker and a listener with respect to his/her own verbal behavior 
(Skinner, 1957), they also do not deny that these statements can have effects, especially when they occur 
in a social-verbal context where their presence is considered indicative of psychological maladjustment 
(Dougher & Hackbert, 1994). Dougher and Hackbert (1994; 2000) describe how negative self-evaluations 
in response to insufficient reinforcement, punishment, or extinction likely serve to both elicit additional 
aversive stimulation and occasion depressive behavior. That is, the negative self-statements (e.g., “I’m a 
loser, Nobody likes me.”) might exacerbate feelings of sadness and function as establishing operations, 
altering the evocative effects of environmental stimuli (e.g., the sight of a group of peers serves as a 
discriminative stimulus for punishment), increasing the reinforcing value of depressive behavior (e.g., 
avoidance of peers) and abolishing the reinforcing value of non-depressive behavior (e.g., approaching 
peers), potentially also contributing to the development of self-rules that further maintain depressotypic 
behavior (“Why bother trying to meet people, nobody likes me, I’m unlikable”). 

The preceding provides a behavioral rationale for potentially targeting self-statements 
therapeutically and for appreciating how cognitive therapy might have some beneficial effects from a 
behavioral perspective. There are currently several different views on how best to target self-statements 
therapeutically. Hayes, Strosahl, and Wilson (1999) promote altering the social-verbal context supporting 
a link between negative thoughts and depressive behavior. This is pursued through the use of cognitive 
defusion procedures, which target for change the function of thoughts without attempting to alter their 
content or frequency. Traditional cognitive-behavioral therapists on the other hand generally target the 
content of negative thoughts for change, which is pursued through the use of cognitive restructuring 
techniques designed to help the client to challenge and dispute negative self-statements so as to arrive at 
more rational, adaptive, and less extreme self-evaluations (Beck et al., 1979; Greenberger & Padesky, 
1995; Persons, Davidson, & Tompkins, 2001). A final approach that has received some attention in 
clinical studies and in the precision teaching literature emphasizes increasing the frequency of positive 
self-thoughts through structured identification, elaboration, and rehearsal of positive self-statements 
(Calkin, 1992; Lange et al., 1998). 

While it might be conceptually sensible to target negative self-statements in therapy, whether 
doing so is necessary or sufficient to produce change is an area that is currently being debated. In 
cognitive-behavior therapy for depression, modification of self-thoughts is considered part of the 
“cognitive” portion of the intervention. Beck and others (Beck et al., 1997; DeRubeis & Feeley, 1990; 
Hollon, 2000) have been clear in hypothesizing that the cognitive components aimed at modifying 
negative thoughts are primarily responsible for CBT’s efficacy. For example, Beck and colleagues (1979; 
p. 146) stated “The most critical stage of cognitive therapy involves training the patient to observe and 
record his thoughts.” However, time course analyses suggest that the majority of improvement occurs 
early in treatment, prior to the introduction of the explicitly cognitive techniques (Ilardi & Craighead, 
1994). Dismantling studies further suggest that behavioral activation alone is as efficacious and enduring 
as comparison conditions that added cognitive techniques (Gortner, Gollan, Dobson, & Jacobson, 1998; 
Jacobson et al., 1996). Thus, cognitive modification techniques may not be necessary to the change 
process. However, these data do not address whether they might be sufficient. 

In the present study we focused on comparing one technique (i.e., the Thought Record) for 
challenging negative self-statements and a separate technique (i.e., Fluency Training) designed to increase 
positive self-statements. Isolating these techniques for evaluation allowed us to begin to test their 
sufficiency for producing change and also to compare their relative efficacy and potential unique effects. 

The Thought Record is considered one of the essential components of CBT for depression and is 
a primary vehicle used in attempting to modify negative self-thoughts (Greenberger & Padesky, 1995; 
Persons et al., 2001). Thought Record training involves teaching the client to identify negative thoughts, 
examine evidence for and against the negative thoughts, explore possible alternative explanations, and 
substitute more accurate, realistic, or less extreme thoughts. As such, the Thought Record is one of the 
most elaborated self-statement modification techniques available. 

While disputation of negative self-statements using the Thought Record involves generating less 
extreme or more adaptive self-statements, the focus is not typically on explicitly increasing positive 
self-statements (Lange et al., 1997). However, the possibility of increasing positive self-statements has been 
explored in several smaller scale studies. Philpot and Bamburg (1996) randomized college students 
reporting low self-esteem to either a control condition or a condition in which participants rehearsed a list 
of 15 positive self-statements three times daily for two weeks. Significantly greater improvement in 
self-esteem and depression was reported in the rehearsal condition. Lange and colleagues (1998) randomized 
college students with low self-esteem to positive self-instruction training or a neutral task control 
condition. The intervention involved generating a list of positive personal characteristics, writing an essay 
incorporating them (session 1), and reducing the essay to a list of positive self-statements (session 2), 
which over the next three weeks was to be read twice daily. Compared to controls, the intervention group 
reported significant improvement in self-esteem. 

A parallel approach has also been developing in the field of behavior analysis, where Calkin 
(1981, 1992, 2000, 2002) has advocated applying precision teaching strategies to self-thoughts. Precision 
teaching involves identifying and counting a target behavior and increasing the rate of that behavior until 
“fluency” is established through short (e.g., 1-min) repeated timed practices. A classic example is 
Lindsley’s SAFMEDS (say all fast a minute each day shuffled) method with flash cards. A performance 
is said to be fluent when the target behavior is not only accurate but also occurs at a high rate (i.e., is fast, 
automatic, or second-nature; see Binder, 1996; Lindsley, 1996). Calkin (1992) reported data from 35 
people using fluency training to increase positive self-thoughts and improve self-esteem. After a baseline 
during which positive and negative self-thoughts were self-monitored, participants were asked to write as 
many positive self-thoughts as they could during 1-minute timings once per day. This intervention 
resulted in participants, on average, doubling their number of self-positives and reporting subjective 
increases in self-esteem. 

In the present study college students reporting significant distress, low self-esteem, and 
depressive symptoms were randomly assigned to either (a) Thought Record (TR) training or (b) Fluency 
Training (FT). Commonly used clinical measures were employed to evaluate the clinical relevance of the 
effects and to identify possible treatment specific effects. In addition, fluency with positive and negative 
self-thoughts was directly measured (and evaluated in comparison to normative data collected by the 
authors) providing additional information on treatment specificity. Follow-up data were collected at least 
one month post-treatment. 


Method 


Participants 

Undergraduate students from a large U.S. university who reported significant distress and low 
self-esteem were recruited via flyers and class announcements. Participants were screened using the Brief 
Symptom Inventory - Global Severity Index (BSI-GSI; Derogatis, 1993) and the Rosenberg Self-Esteem 
Scale (RSES; Rosenberg, 1989) and included if they scored one SD above the mean according to the adult 
non-patient norms on the BSI-GSI and one SD below the mean for a college population on the RSES (see 
Vispoel et al., 2001). Individuals endorsing strong suicidal ideation and those receiving other 
psychological treatment were excluded. Those receiving pharmacotherapy were enrolled if they had been 
on the medication for at least eight weeks. Thirty students met inclusion criteria; nine others were 
excluded for failing to meet them, and no one met exclusion criteria. 

There were no statistically significant demographic differences between the TR and FT groups 
suggesting comparable groups were attained (see Table 1). Consistent with recommendations in the 
literature, our use of the BSI emphasized the global severity index as a measure of psychological distress 
(Boulet & Boss, 1991, as cited in Bufka, Crawford, & Levitt, 2002). The sample BSI-GSI mean (SD) 
of 1.51 (0.30) exceeded normative means (Cochran & Hale, 1985) and means reported among a large 
sample of college students seeking services at a counseling center of a private university (Cornish et al., 
2000) by more than one standard deviation. In addition, the sample RSES mean (SD) of 22.47 (3.32) was 
1.9 standard deviations from a normative mean. Moreover, 46% had a history of mental health treatment, 
often for mood problems (50%; see Table 1). One participant, a treatment completer diagnosed with 
bipolar disorder, was on Lexapro and Depakote for more than 2 months prior to and throughout 
participation. In sum, the inclusion criteria produced a relatively severe sample. 
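
The severity comparisons above are simple standardized distances and can be checked by hand. The sketch below assumes the relevant norms are the college values reported under Measures (RSES: M = 32.60, SD = 5.25; BSI-GSI: M = 0.84, SD = 0.55 for males and M = 0.71, SD = 0.42 for females); the function name is illustrative and not part of the study.

```python
def sd_units_from_norm(sample_mean, norm_mean, norm_sd):
    """Signed distance of a sample mean from a normative mean, in normative SD units."""
    return (sample_mean - norm_mean) / norm_sd


# RSES: sample M = 22.47 against the college norm (assumed: Vispoel et al., 2001)
rses_z = sd_units_from_norm(22.47, 32.60, 5.25)

# BSI-GSI: sample M = 1.51 against college norms (assumed: Cochran & Hale, 1985)
gsi_z_males = sd_units_from_norm(1.51, 0.84, 0.55)
gsi_z_females = sd_units_from_norm(1.51, 0.71, 0.42)

print(f"RSES: {rses_z:.2f} SD from the norm")
print(f"BSI-GSI: {gsi_z_males:.2f} SD (male norms), {gsi_z_females:.2f} SD (female norms)")
```

Under these assumptions the RSES mean falls about 1.9 SD below the norm, and the BSI-GSI mean lies more than one SD above either college norm, which agrees with the figures given in the text.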

Table 1

Demographic and Past Treatment Characteristics

                               Intent to treat   Completers       Completers
Variable                       (N = 30)          TR (n = 10)      FT (n = 10)

Age                            21.33 (5.13)      21.70 (7.20)     20.50 (1.60)
GPA                            3.13 (0.53)       2.96 (0.61)      3.36 (0.47)
Sex (% female)                 73%               80%              80%
Ethnicity (% Euro-American)    90%               90%              90%
Full-time student              97%               90%              100%
Yr in school:
  Freshman                     27%               20%              20%
  Sophomore                    33%               40%              30%
  Junior                       20%               30%              20%
  Senior                       20%               10%              30%
Tobacco Use                    20%               30%              30%
Hx of Mental Health Tx         46%               50%              50%
Tx focus:
  Depression                   4                 2                1
  Bipolar                      2                 0                2
  Depression + OCD             1                 0                1
  School refusal               1                 1                0
  Stress/Family Problems       4                 2                1
  Alcohol                      1                 0                0
  Stress/Family + Alcohol      1                 0                1
Hx of medication               17%               10%              30%

Design and Measures 

Participants were stratified by gender and then randomized to either Thought Record (TR) 
training or Fluency Training (FT), both of which consisted of three weekly treatment sessions. Measures 
were taken at pretreatment, post-treatment, and follow-up and consisted of common clinical self-report 
measures and a self-thought fluency assessment (STFA) procedure developed by the authors. To reduce 
potential demand characteristics, participants were informed that during the treatment portion of the study 
the experimenter was kept blind to all measures except for those used in determining eligibility (i.e., the 
BSI and RSES) and implementing the initial portion of the intervention (i.e., STFA). The following 
measures were collected: 

Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1989). The 10-item RSES asks participants to 
rate their level of agreement (range 0-40) with statements describing their general view of themselves. 
Higher scores indicate a more positive self-evaluation with a mean of 32.60 (SD = 5.25) established in a 
large nonpatient college sample (Vispoel et al., 2001). 

Brief Symptom Inventory (BSI; Derogatis, 1993). This 53-item questionnaire is designed to 
reflect psychological symptom patterns. Items are endorsed on a scale of 0 (not at all) to 4 (extremely). 
Normative means on the BSI-GSI with college students of 0.84 (SD = 0.55) for males and 0.71 (SD = 
0.42) for females were reported by Cochran & Hale (1985). 

Beck Depression Inventory-II (BDI-II; Beck et al., 1996). This widely used 21-item self-report measure 
assesses the severity of depressive symptoms. The normative mean from a large collegiate sample is 9.11 
(SD = 7.57) with recommended descriptors of 0-12 Nondepressed, 13-19 Dysphoric, 20-63 Dysphoric- 
Depressed (Dozois, Dobson, & Ahnberg, 1998). 

Suicidal Ideation Index. A suicidal ideation index was derived by summing items 9 and 39 from 
the BSI (“Thoughts of ending your life” and “Thoughts of death or dying”) and item 9 from the BDI 
(“Suicidal thoughts or wishes”). 

Automatic Thoughts Questionnaire-Negative (ATQ-N; Hollon & Kendall, 1980). The 30-item 
ATQ-N measures the frequency of negative self-statements. Each item is scored on a 5-point scale, ranging 
from 1 (not at all) to 5 (all the time), with higher scores indicating more negativity. The mean among 
normative samples is 52.91 (SD = 18.18; Dozois et al., 2003). 

Automatic Thoughts Questionnaire-Positive (ATQ-P; Ingram & Wisnicki, 1988). This 
30-item instrument measures the frequency of positive self-statements and is scored on a scale from 1 (not at 
all) to 5 (all the time). The normative mean averaged across samples, and reported by Dozois et al. 
(2003), is 98.61 (SD = 13.02). 

Dysfunctional Attitudes Scale (DAS; Weissman & Beck, 1978). The DAS is a 40-item 
instrument scored on a 1-7 scale. Lower scores indicate more adaptive beliefs. The mean among 
normative samples, reported by Dozois et al. (2003), is 119.01 (SD = 26.89). 

Acceptance & Action Questionnaire (AAQ; Hayes et al., 2004). The 9-item AAQ measures 
ability to take action despite uncomfortable thoughts/feelings. Each item is scored on a 1-7 scale, with 
higher scores indicating greater experiential avoidance and immobility. The mean for clinical populations 
is 38-40. For non-clinical populations it is 33.4 (SD = 7.2). 

Self-thought Fluency Assessment (STFA). Developed and pilot tested by the authors, this 
measure involves two separate 3-minute periods, one for positive and one for negative self-thoughts. In 
each period the individual is first given two minutes to collect his/her self-thoughts and then one minute 
to write as many positive or negative self-thoughts as s/he can. After both positive and negative thoughts 
are generated, each is rated on a 5-point scale for both personal importance (PI) and believability (B), 
with 1 being extremely important/believable and 5 being not at all important/believable. The following 
scores are derived from this procedure (data from a non-distressed college sample, N = 58, M age = 22 
years, 57% female, are presented in parentheses): total number of positive thoughts (M = 9.86, SD = 3.00), 
total number of negative thoughts (M = 6.50, SD = 2.52), ratio of positive to negative thoughts (M = 1.68, 
SD = 0.68), average positive PI (M = 1.94, SD = 0.47) and B (M = 1.88, SD = 0.47), and average negative 
PI (M = 2.80, SD = 0.70) and B (M = 2.61, SD = 0.77). 
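
The STFA scores described above can be derived mechanically from one participant's written thoughts and ratings. The following is a minimal sketch, not the authors' scoring procedure: it assumes each thought is stored as a (personal importance, believability) pair on the 1-5 scale, and the function name, dictionary keys, and example ratings are hypothetical.

```python
from statistics import mean


def score_stfa(positive, negative):
    """Derive STFA summary scores for one participant.

    positive / negative: lists of (personal_importance, believability) ratings,
    one tuple per written self-thought (1 = extremely, 5 = not at all).
    """
    def average(ratings, index):
        # No thoughts written means there is no average rating to report.
        return mean(r[index] for r in ratings) if ratings else None

    n_pos, n_neg = len(positive), len(negative)
    return {
        "total_positive": n_pos,
        "total_negative": n_neg,
        # The ratio is undefined when no negative thoughts were written.
        "ratio_pos_to_neg": n_pos / n_neg if n_neg else None,
        "positive_PI": average(positive, 0),
        "positive_B": average(positive, 1),
        "negative_PI": average(negative, 0),
        "negative_B": average(negative, 1),
    }


# Hypothetical ratings for illustration only.
example = score_stfa(
    positive=[(1, 2), (2, 2), (3, 1), (2, 3)],
    negative=[(3, 4), (5, 2)],
)
print(example)
```

Returning `None` for an undefined ratio or average is a design choice of this sketch; the source does not say how such cases were handled.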

Treatment evaluation. This 11-item questionnaire developed by the researchers asked 
participants to rate aspects of the treatment, the therapist, and their participation on a scale from 1 (not at 
all) to 5 (extremely). 

Therapist 

The first author, a doctoral student in clinical psychology, conducted all of the treatment sessions. She had 
completed an MA (and was board certified) in applied behavior analysis, had experience using precision 
teaching, and had completed graduate coursework in CBT and a 2-year practicum at an outpatient clinic 
providing CBT. Additional training involved watching Thought Record instructional videos (i.e., APA, 
2000; New Harbinger Publications, 1996) and role-play practices. A Ph.D.-level psychologist trained in 
CBT and behavior analysis supervised. 

Treatments 

Participants in both conditions received three weekly therapy sessions. The first session lasted 
two hours: the first hour covered the consent form, screening, rapport building, and pre-treatment assessment 
measures, and the second hour began the intervention. The second and third therapy sessions each lasted one 
hour and focused completely on the relevant intervention. The treatment conditions were brief due to their 
focus on specific therapeutic techniques. 

Thought record training condition. TR training focused on challenging and changing the 
participants’ negative self-thoughts using the 7-column Thought Record, as described and demonstrated 
by Padesky (Greenberger & Padesky, 1995; New Harbinger Publications, 1996) and consistent with the 
approach outlined by Persons et al. (2001). The Thought Record helps the user to identify negative 
automatic thoughts, the situations in which they occur and the associated emotions, the evidence for and 
against them, and, finally, to generate more balanced, adaptive thoughts. 

The therapist used the Thought Record as a framework for introducing the cognitive model to the 
participant, incorporating examples from the participant’s list of self-negatives obtained during the STFA 
to demonstrate the relationship between thoughts, moods, and behaviors (as recommended by Persons et 
al., 2001). After providing the rationale, the therapist and participant collaboratively discussed situations 
in which the participant felt bad about him/herself, identified negative thoughts, and then evaluated 
them using the Thought Record. This collaborative work provided modeling, guided practice, and an 
opportunity for clarification of questions about the Thought Record. The importance of practice was 
explained and copies of the Thought Record provided and assigned for homework; participants were 
encouraged to challenge all negative thoughts, but to formally record three per day. The second and third 
therapy sessions were spent reviewing the participants’ homework from the previous week, and 
challenging and practicing additional negative thoughts. 

Fluency training condition. FT focused on improving the automaticity of the participants’ 
positive self-thoughts by increasing both the number of positive thoughts s/he could readily identify and 
the rate at which s/he could identify them. During the FT psychoeducation piece, the therapist reviewed 
the participant’s list of positive thoughts from the self-thought assessment and asked him/her how s/he 
became good at x (e.g., Have you always been good at writing poetry? How did you improve?). In 
addition to using the personal example, the therapist also described learning to drive a standard shift car to 
illustrate the importance of practicing a new skill in order for it to become automatic and considered 
mastered. Lastly, the therapist explained how thinking differently is a new skill to be learned, one which 
needs to be practiced. 

During the FT practice, the participant first wrote his/her positive self-thoughts from the 
self-thought fluency assessment on index cards. Second, the therapist described how math flashcards have the 
problem on one side with the answer on the other. Similarly, the positive self-thoughts were considered to 
be the “answer” or correct response to be learned. On the opposite side of the card a “clue/trigger” that 
might occasion the correct response was identified by the participant in collaboration with the therapist. 
These clues/triggers included a variety of situations/life domains (e.g., family relations, education, etc.), 
people, and activities. The participant then read the set of cards to him/herself, focusing on committing 
them to memory. Next, the therapist conducted flashcard drills with the participant until s/he could say 
her/his self-positives aloud without the cards. Fluency was assessed by three timed “mastery trials” in 
which the participant recited her/his set of positive thoughts aloud as quickly and accurately as s/he could. 
When the set could be articulated without omissions or hesitation during the timed in-session trials, the 
performance was deemed fluent. 

The participant was then asked to identify more positive self-statements in order to expand her/his 
original list. For example, if the original list consisted of five self-positives, s/he would add five more 
each time the current set was mastered. This strategy provided individualized fluency training 
goals. To the extent possible, participants created their new self-positive cards independently and in 
collaboration with the therapist; however, to facilitate item addition, a list of life domains and a list of 
positive self-characteristics (provided to us by Calkin) were offered to prompt recognition of relevant 
items. Participants were also encouraged to use positive qualities that others had identified about them. As 
self-thoughts were identified, care was taken to ensure that they were not Pollyanna-ish, but instead had 
some referent in the client’s life experience which s/he could articulate. 

The participants were asked to carry their set of flashcards with them, to practice them as 
often as possible (shuffling the cards between each practice), and to complete at least three formal flashcard 
drill practices per day. They were also asked to keep a journal of 1-minute daily drills in which they wrote 
as many positive self-thoughts as they could for one minute. 

Results 


Treatment Fidelity 

Treatment adherence was measured using short questionnaires (available from the authors), one 
for each treatment session, which included three subscales: general therapy, TR-specific, and FT-specific. 
The general therapy subscale included items regarding issues such as provision of a clear rationale, 
establishment of a collaborative relationship, and bridging from the previous session. The other two 
subscales focused on use of the technique specific to one of the two treatments. All items were scored on 
a 6-point scale from 1 (not at all) to 6 (extensively). Scores on the treatment-specific subscales were 
expected to differ depending on whether the session focused on TR or FT. Treatment adherence forms were completed immediately after each 
session by the therapist. In addition, a doctoral student in clinical psychology who was blind to condition 
observed 25% of the treatment session videotapes and completed the adherence forms. 

Agreement. A Pearson’s product-moment correlation demonstrated strong inter-rater agreement 
between therapist and coder item ratings (r = .88, p < .001). Kappa was calculated by dichotomizing item 
adherence scores into 1-3 (not at all to minimally) and 4-6 (considerably to extensively) and treating them as 
categorical, also resulting in very good rater agreement (κ = .86, p < .001). 
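
The kappa calculation on dichotomized ratings can be sketched as follows. This is an illustrative implementation of Cohen's kappa for two raters and two categories, assuming the 1-3 versus 4-6 split described above; the ratings are hypothetical and do not reproduce the reported value of .86.

```python
def dichotomize(score):
    """Map a 1-6 adherence rating to 0 (ratings 1-3) or 1 (ratings 4-6)."""
    return 1 if score >= 4 else 0


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters after dichotomizing their 1-6 item ratings."""
    a = [dichotomize(s) for s in rater_a]
    b = [dichotomize(s) for s in rater_b]
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Chance agreement: both raters say "adherent" or both say "not adherent".
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    # Note: kappa is undefined (division by zero) when expected agreement is 1.
    return (observed - expected) / (1 - expected)


# Hypothetical item ratings from a therapist and an independent coder.
therapist = [5, 6, 1, 1, 4, 2, 5, 1]
coder = [5, 5, 1, 2, 3, 1, 6, 1]
kappa = cohens_kappa(therapist, coder)
print(round(kappa, 2))
```

Unlike the Pearson correlation, this statistic ignores how far apart two ratings are within a category and only asks whether both raters placed an item on the same side of the adherence cutoff.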

Adherence. Average treatment adherence scores were calculated for each subscale. Scores of 4 
and above were considered to represent adherence. For TR sessions there was a significant difference 
between raters on the general therapy subscale, F(1, 14) = 5.65, p = .03. While both the therapist (M = 
4.81, SD = 0.65) and coder (M = 5.42, SD = 0.32) indicated adherence, the coder ratings were higher. 
There were no differences on the TR subscale (therapist M = 4.84, SD = 0.89; coder M = 5.07, SD = 
1.31) or the FT subscale (therapist and coder M = 1.00, SD = 0). These data indicate strong and specific 
adherence to the TR protocol. For the FT sessions, there were no significant differences between the 
raters on the general therapy subscale (therapist M = 5.24, SD = 0.55; coder M = 5.33, SD = 0.35) or the 
FT subscale (therapist M = 5.90, SD = 0.15; coder M = 5.94, SD = 0.11). However, the TR subscale 
differed (therapist M = 1.10, SD = 0.14; coder M = 1.83, SD = 0.32), F(1, 12) = 30.3, p < .001. Importantly, 
neither mean was indicative of adherence to TR. Moreover, the difference was isolated to an item on the 
use of Socratic questioning, which was minimally used in FT during the generation of new self-positives. 
The therapist underrated the use of Socratic questioning, while the coder correctly identified this 
technique. These data indicate strong and specific adherence to the FT protocol. 

Acute Treatment Outcome 

Of the 30 qualifying participants, 20 completed the study (3 treatment sessions and post-treatment 
assessment), and 10 dropped out (a 33% attrition rate). Seven of these dropouts terminated after the first 
treatment session and three following the second treatment session. Attrition rates were comparable in 
both conditions: five dropped out from FT and five from the TR condition. When provided, reasons for 
dropping out of the study included family emergencies, other time commitments, and seeking therapy 
elsewhere. Dropouts did not differ significantly from completers on the RSES, F(1, 30) = 0.60, p = .45, or 
on the BSI-GSI, F(1, 30) = 0.43, p = .52. 

Descriptive statistics for the clinical outcome measures and the STFA, as well as the results of the 
between-group and within-group comparisons conducted with completers are presented in Table 2 (results 
from the intent-to-treat sample are described in the text). Because of the large number of comparisons, 
alpha was set at .01 for these analyses. First, a series of ANCOVAs were conducted with post-treatment 
scores as the dependent variable and pre-treatment scores as a covariate. The ANCOVAs using the 
clinical self-report measures showed no significant differences between the two conditions on global 
distress, self-esteem, depressive symptoms, suicidal ideation, negative and positive thinking, and 
maladaptive beliefs. The same analyses were repeated based on an intent-to-treat approach (using a last 
data point carried forward method) and revealed similar non-significant differences. On the STFA, as 
expected due to the nature of the treatments, the FT group demonstrated a significantly greater number of 
positive self-thoughts. The intent-to-treat analyses also revealed highly significant treatment differences 
with respect to the total number of self-positives, F(1, 30) = 16.77, p < .001. 
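
The last-data-point-carried-forward step behind the intent-to-treat analyses can be illustrated with a short sketch. It assumes each participant's scores are stored in assessment order with `None` marking a missed assessment; the function name and scores are hypothetical.

```python
def carry_last_forward(scores):
    """Replace each missing (None) assessment with the most recent observed score."""
    filled, last_seen = [], None
    for score in scores:
        if score is not None:
            last_seen = score
        filled.append(last_seen)
    return filled


# Hypothetical pre/post scores: a completer and a participant who dropped out.
completer = carry_last_forward([24, 12])
dropout = carry_last_forward([26, None])
print(completer, dropout)
```

For a dropout, the pre-treatment score stands in for the missing post-treatment score, so that participant contributes zero change to the analysis.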

Paired samples t tests were used to analyze the differences between pre- and post-treatment scores 
within each condition (see Table 2). With respect to the clinical measures, statistically significant changes 
from pre- to post-treatment scores were seen across conditions, suggesting that individuals in both 
treatments improved. In the TR condition, statistically significant differences were found on global 
distress, self-esteem, depressive symptoms, suicidal ideation, negative automatic thoughts and 
experiential avoidance, while in the FT condition, statistically significant differences were observed on 
global distress, self-esteem, depressive symptoms, negative and positive automatic thoughts, maladaptive 
beliefs, and experiential avoidance. The STFA data hinted at the possibility of some treatment specific 
effects as both groups showed a significant improvement in the ratio of positive to negative self-thoughts 
on the STFA, but for different reasons. The FT group showed an increase in self-positives, while the TR 
group showed a decrease in negative self-statements. 
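
The within-condition comparisons rest on the paired-samples t statistic, which can be computed as in the sketch below. The scores are hypothetical, and the pre-minus-post direction is an assumption chosen because it matches the signs in Table 2 (positive t for measures that decreased, negative t for measures that increased).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Paired-samples t statistic and degrees of freedom (pre minus post)."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical depression scores for five participants.
pre_scores = [22, 25, 19, 30, 27]
post_scores = [12, 15, 14, 18, 16]
t_value, df = paired_t(pre_scores, post_scores)
print(f"t({df}) = {t_value:.2f}")
```

The p value would then be read from a t distribution with the returned degrees of freedom, a step left out here to keep the sketch free of third-party libraries.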

Table 2

Descriptive Statistics for Outcome Variables at Pre-treatment (Pre) and Post-treatment (Post) and Results of
Between-Group and Within-Group Comparisons for Completers

            TR Pre           TR Post          FT Pre           FT Post          TR vs. FT         TR Pre vs. Post    FT Pre vs. Post
            (n = 15)         (n = 10)         (n = 15)         (n = 10)
Measures    M (SD)           M (SD)           M (SD)           M (SD)           g      F          t^b        CSC %   t^b        CSC %

Clinical
BDI-II^a    22.26 (6.08)     12.00 (6.07)     26.22 (10.18)    13.70 (7.45)     0.24   0.10       5.22***    70      4.62***    60
BSI         1.43 (0.38)      0.99 (0.34)      1.59 (0.60)      0.90 (0.54)      0.19   0.84       3.81**     50      3.92**     60
RSES        23.33 (3.39)     29.15 (2.29)     21.60 (3.11)     27.60 (4.09)     0.45   0.04       _4 go***   80      -5.09***   60
SII         1.53 (1.55)      0.40 (0.97)      1.13 (1.30)      0.50 (1.08)      0.09   0.68       3.28**     --      2.09       --
ATQ-N       78.60 (18.40)    55.30 (15.41)    88.26 (21.83)    52.30 (12.86)    0.20   1.44       4.63***    80      6.20***    90
ATQ-P       66.47 (15.45)    87.20 (26.62)    64.20 (9.16)     84.40 (18.08)    0.12   1.34       -2.60*     70      -4.34***   40
DAS         160.73 (29.27)   142.00 (19.90)   178.60 (19.87)   146.20 (28.25)   0.16   0.42       3.12*      60      4.14**     40
AAQ         42.67 (4.48)     37.10 (5.28)     42.47 (5.25)     37.00 (5.73)     0.02   0.06       3.27**     70      2.73*      70

STFA
Total +     7.00 (2.39)      7.3 (1.70)       6.33 (2.09)      14.20 (2.90)     2.68   70.55***   -1.33      40      H19***     100
+ PI        2.22 (0.62)      2.29 (0.82)      2.51 (0.48)      2.05 (0.48)      0.33   1.93       -0.38      60      2.30*      50
+ B         2.35 (0.61)      2.09 (0.52)      2.43 (0.64)      2.27 (0.55)      0.29   0.82       1.64       70      1.34       40
Total -     7.53 (2.64)      6.30 (1.57)      8.13 (2.13)      7.80 (3.52)      0.51   0.28       3.00*      80      0.76       50
- PI        2.29 (0.72)      2.84 (0.69)      2.30 (0.71)      2.71 (0.83)      0.16   0.07       -2.34*     70      -1.69      40
- B         2.37 (0.78)      2.70 (0.81)      2.18 (0.69)      2.80 (0.62)      0.13   0.24       -0.86      60      -2.41*     80
Ratio +/-   0.94 (0.14)      1.20 (0.30)      0.80 (0.28)      2.10 (0.86)      1.29   7.17*      -3.59**    70      -4.23**    100
PI Diff     0.07 (0.67)      0.55 (0.79)      -0.21 (0.87)     0.65 (0.69)      0.12   0.92       -1.76      60      -2.76*     80
B Diff      0.03 (0.86)      0.61 (0.85)      -0.25 (0.97)     0.54 (0.69)      0.08   0.03       -1.55      70      -3.16*     80

Note. + = positive self-thoughts, - = negative self-thoughts, PI = personal importance, B = believability, 
Diff = difference between change in positive and change in negative 

a For one participant who failed to complete the second side of the BDI at pre-treatment a prorated BDI 
total score was used. To the raw score from the first side (10) we added the sum of the item means for the 
questions on side two (7.26). The item means were taken from those reported by Beck et al. (1996). 
b In addition to the paired t tests, Wilcoxon Signed Ranks Tests, which use medians, were also conducted. 
The conclusions drawn from both types of analysis were identical in all cases. 

*p < .05, **p < .01, ***p < .001 


Effect Size and Clinical Significance. To supplement the ANCOVA and paired samples t test 
results, we calculated post-treatment between-groups effect sizes using Hedges' g (see Table 2). On the 
clinical self-report measures, effect sizes were small at post-treatment (M = 0.18) and inconsistent 
in which treatment they favored. On the self-thought fluency assessment (STFA), large effect sizes were 
observed on the total number of self-positives (g = 2.68) and the ratio of positive to negative self-thoughts 
(g = 1.29), both favoring the FT condition. With respect to negative self-thoughts, effect sizes favored the TR 
condition (g = 0.51). The remaining STFA effect sizes were small (M = 0.19, range 0.08 - 0.33). 
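
As a worked illustration of the between-groups effect size, the following minimal sketch computes Hedges' g from summary statistics. It assumes the common pooled-SD form with the small-sample correction 1 - 3/(4df - 1); the text does not state which variant was used, so the function name and formula choice are assumptions. The inputs are the post-treatment BDI-II means and SDs for completers from Table 2.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample correction.

    Assumed formula (not confirmed by the source): pooled-SD Cohen's d
    multiplied by 1 - 3 / (4 * df - 1), with df = n1 + n2 - 2.
    """
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    return d * (1 - 3 / (4 * df - 1))

# Post-treatment BDI-II, completers (Table 2): FT 13.70 (7.45), TR 12.00 (6.07), n = 10 each.
g_bdi = hedges_g(13.70, 7.45, 10, 12.00, 6.07, 10)
print(round(g_bdi, 2))  # 0.24, matching the tabled value
```

Under these assumptions the BDI-II entry is reproduced. Entries that do not reproduce exactly from the tabled means may reflect rounding or a different variant of the formula.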

To supplement the comparison of the means, we calculated clinically significant change (CSC) 
according to criterion C from Jacobson and Truax (1991). Using the pre-treatment data from our sample 
and normative data on each measure (provided in the Design and Measures section), we established for each 
measure a cutoff score beyond which a participant was closer to the mean of the normative population than 
to the mean of the dysfunctional population. The percentages of participants meeting the CSC criterion on the clinical self-report 
measures ranged from 50-85%, with a mean of 64%, indicating that the majority of participants showed 
clinically significant improvement. Averaging across measures, the percentages of participants meeting 
criteria were similar across the TR and FT conditions (Ms = 69% and 60%, respectively). On the STFA at 
post-treatment, the percentages of participants in TR and FT reaching CSC were 40% and 100%, respectively, in total 
number of self-positives and 80% and 50%, respectively, in total number of self-negatives. 
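
The cutoff described above can be sketched as follows. Criterion c in Jacobson and Truax (1991) is the SD-weighted point between the dysfunctional and normative means. The numeric values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the normative data used in this study, and the function names are illustrative.

```python
def csc_cutoff_c(mean_dys, sd_dys, mean_norm, sd_norm):
    """Criterion c: the score equally distant, in SD units, from both means."""
    return (sd_dys * mean_norm + sd_norm * mean_dys) / (sd_dys + sd_norm)

def meets_csc(post_score, cutoff, lower_is_better=True):
    """True when a post-treatment score falls on the normative side of the cutoff."""
    return post_score < cutoff if lower_is_better else post_score > cutoff

# Hypothetical illustration (NOT the study's values): a symptom measure where
# the distressed sample has M = 24, SD = 8 and the normative sample M = 10, SD = 6.
cutoff = csc_cutoff_c(mean_dys=24.0, sd_dys=8.0, mean_norm=10.0, sd_norm=6.0)
print(cutoff)                   # 16.0: one SD from each mean
print(meets_csc(12.0, cutoff))  # True
print(meets_csc(20.0, cutoff))  # False
```

For measures on which higher scores are healthier (for example, a self-esteem scale), the same cutoff applies with `lower_is_better=False`.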

Comparison to a no or minimal treatment control group. Given the lack of group differences 
between the TR and FT conditions, it is reasonable to ask if the pre to post changes observed were the 
result of the treatments being similarly efficacious or due to extraneous variables. The current design did 
not include a concurrent no or minimal treatment control condition to directly address this question, in 
part because two related studies did and found superior effects for the treatment condition. As in the 
current study, both Philpot and Bamburg (1996) and Lange et al. (1998) used undergraduate samples, 
included based on low self-esteem scores, with pre to post data collected at an approximately 1 month 
interval. As such, these findings can be used as a yardstick for evaluating the current results. Within-group 
effect sizes on comparable measures are presented in Table 3 and suggest that, for both the 
completer and intent-to-treat TR and FT samples, changes in self-esteem, depression, and negative 
thinking were large (g = .79 - 1.9) and clearly exceeded the effects typically found in control conditions 
(g = .02 - .23). 
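
The text does not spell out how the within-group effect sizes were computed. One reading that reproduces the completer entries for the TR condition in Table 3 treats the pre-treatment (n = 15) and post-treatment (n = 10) scores from Table 2 as two independent samples and applies the pooled-SD form of Hedges' g. The sketch below is offered under that assumption only; the helper name is illustrative.

```python
import math

def within_group_g(pre_m, pre_sd, n_pre, post_m, post_sd, n_post):
    """Absolute pre-post Hedges' g, treating pre and post as independent samples.

    This is an assumed reading, not a formula stated in the source.
    """
    df = n_pre + n_post - 2
    pooled_sd = math.sqrt(((n_pre - 1) * pre_sd ** 2 + (n_post - 1) * post_sd ** 2) / df)
    d = abs(pre_m - post_m) / pooled_sd
    return d * (1 - 3 / (4 * df - 1))

# (pre M, pre SD, post M, post SD) for the TR condition, taken from Table 2.
tr_measures = {
    "BDI-II": (22.26, 6.08, 12.00, 6.07),
    "ATQ-N": (78.60, 18.40, 55.30, 15.41),
    "RSES": (23.33, 3.39, 29.15, 2.29),
}
tr_g = {
    name: round(within_group_g(pre_m, pre_sd, 15, post_m, post_sd, 10), 2)
    for name, (pre_m, pre_sd, post_m, post_sd) in tr_measures.items()
}
print(tr_g)  # {'BDI-II': 1.63, 'ATQ-N': 1.3, 'RSES': 1.87}
```

Each value is several times larger than the largest control-condition effect size in Table 3 (0.23), which is the comparison the paragraph above draws.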


Table 3 

Within-group effect sizes, Hedges' g, on compatible measures for completer (and intent-to-treat) samples from 
the current treatment conditions and relevant treatment and control groups from the literature. 

                           BDI a          ATQ-N          Self-esteem b
TR                         1.63 (.79)     1.30 (.80)     1.87 (.87)
FT                         1.31 (1.11)    1.85 (1.25)    1.65 (1.18)
Rehearsal                  0.99           1.11           1.38
Positive Self-Instruction                                1.00
No Treatment Control       0.19           0.23           0.02
Neutral Task Control                                     0.21

a Philpot & Bamburg (1996) used the BDI, while the BDI-II was used in the present study. 
b Philpot & Bamburg (1996) used the Coopersmith Self-esteem Inventory, Lange et al. (1998) measured 
self-esteem using self-evaluation subscales from the Dutch Personality Questionnaire, and the present 
study utilized the Rosenberg Self-Esteem Scale. 


Subsample with a treatment seeking history. Despite the relative severity of our sample, 
college student samples are commonly described as analog, distinguishing them from clinical samples. 
Fifty-five percent of our completer sample (n = 11, 5 in TR and 6 in FT) reported a history of mental 
health treatment, which was for mood problems in 55% of the cases. This subgroup might more closely 
approximate a clinical sample and address concerns about how the interventions described here would 
fare with this population. When the pretreatment scores on the clinical self-report measures between those 
with and without a history of mental health treatment were compared, the means for the former were 
higher in 6/7 cases; however, only the BDI (M = 29.0, SD = 9.54 vs. M = 20.36, SD = 6.95) reached 
statistical significance, F(1, 18) = 5.13, p = .04. While those with a history of mental health treatment 
appeared more severe at pretreatment, both groups improved. The average change on the BDI was 15 
points for those with a treatment history and 9 for those without. Thus, at post-treatment the BDI means 
were not significantly different (M = 14.09, SD = 6.74 vs. M = 11.33, SD = 6.65), F(1, 18) = 0.84, p = .37. 
A scatterplot of the individual pre and post BDI data is presented in the upper panel of Figure 1. The 
lower panel presents only the data from the 6 participants with a history of treatment for mood problems. 


[Figure 1: two scatterplot panels. Upper panel legend: TR-hx, TR-no, FT-hx, FT-no. Lower panel legend: TR-moodhx, FT-moodhx.] 



Figure 1. Scatterplot of individual pre (y axis) and post (x axis) BDI scores across conditions (TR 
= Thought Record training; FT = Fluency Training) for those with any mental health treatment history (hx 
= positive history, no = no history) in the upper panel and those with mental health treatment for a mood 
problem (mood) in the lower panel. The bisecting lines in the graphs represent the CSC criterion score. 


Follow-Up 

Follow-up data were obtained for all but one completer, who had moved to another state. Follow-up 
assessments occurred on average at about 5 weeks, but the interval varied because of scheduling conflicts (M = 5.42, 
SD = 4.21). The groups did not differ in time to follow-up, F(1, 18) = .92, p = .35. Again, because of the 
relatively large number of between- and within-group analyses, alpha was set at .01. ANCOVAs with the 
follow-up scores as dependent variables and post-treatment scores as covariates revealed no significant 
differences between the TR and FT conditions (p range = .10 - .85). On the clinical self-report measures, 
effect sizes were also small at follow-up (M = 0.23, range 0.06 - 0.41). On the STFA, a large effect size 
was observed on the total number of self-positives favoring the FT condition (g = 1.30). With respect to 
negative self-thoughts, a large effect size favored the TR condition (g = 0.81). The remaining STFA effect 
sizes were small (M = 0.21, range 0.03 - 0.46). In terms of CSC, the percentages meeting criteria on the 
clinical self-report measures at follow-up (53-84%) did not change from post-treatment and were similar 
across TR and FT (Ms = 69% and 63%, respectively). 

Paired sample t tests were used to compare post-treatment scores to follow-up scores within each 
treatment condition, revealing only one significant difference. In the TR group, scores continued to improve 
on the ATQ-P, t = -3.89, p = .005. There were no other statistically significant changes, indicating that 
improvements were maintained. 

Treatment Evaluation 

There were no group differences on any of the treatment evaluation items. Participants in both 
conditions rated the rationale for the treatment technique as “very” sensible (M = 4.05, SD = .51) and the 
techniques as “moderately to very” effective (M = 3.85, SD = .67). The therapist was rated as “very to 
extremely” effective in communicating and teaching the techniques (M = 4.70, SD = .57) and motivated 
(M = 4.30, SD = .80). Participants believed more contact with the therapist would have been only 
“somewhat to moderately” helpful (M = 2.85, SD = .93), rated themselves as “moderately to very” 
compliant with the homework (M = 3.40, SD = .88), and reported having mastered the techniques (M = 4.0, SD = 
.46). 


Discussion 

Both brief treatment conditions were associated with significant improvements in general distress, 
self-esteem, depression, depressotypic self-statements, and experiential avoidance. The changes observed 
during treatment were both statistically and clinically meaningful as post-treatment scores approached 
normative ranges for a majority of participants. Moreover, these improvements were not transient but 
were maintained at a follow-up assessment of at least one month. Thus, both TR and FT proved equally 
efficacious and, using the existing literature as a yardstick, equaled other related treatment conditions and 
surpassed outcomes in control conditions. 

Recent data from dismantling studies of CBT for depression call into question the necessity of 
cognitive techniques for producing change. The TR condition in the current study raises the possibility that, 
while not necessary, use of the Thought Record may be sufficient for producing change. However, in the 
absence of a comparison group that controls for common factors (i.e., a sensible rationale with associated 
techniques) caution is warranted in wholeheartedly adopting this interpretation. That said, there is an 
impressive amount of data supporting CBT for depression, cognitive restructuring is a core component of 
the treatment, and the Thought Record is a primary vehicle used in pursuing cognitive restructuring 
(Persons et al., 2001). 

Given the importance placed on the Thought Record in CBT for depression and its years of use 
and development, it is interesting that the FT intervention produced equivalent results. The positive FT 
data are consistent with the literature on the beneficial effects of increasing positive self-verbalizations 
(Calkin, 1992; Lange et al., 1998; Philpot & Bamburg, 1996) and extend it by using a sample that 
appeared more severe than those in previous investigations, comparing FT to another active treatment, 
and demonstrating the maintenance of gains over time. While these data provide some empirical support 
for targeting self-statements in therapy, they do not suggest a focus on self-statements to the exclusion of 
other treatment strategies, namely attempts to change overt behavior via behavioral activation, which has 
been shown to equal or exceed the results of comparison conditions in which cognitive techniques were 
included (Jacobson et al., 1996). 

It is interesting to compare and contrast the self-statement measures. On the STFA, significant 
group differences were observed with respect to the total number of self-positives, as expected due to the 
nature of the FT condition. The effect sizes at post-treatment also indicate large group mean differences 
on the total number of self-positives and ratio of positives to negatives. Participants in the FT condition 
doubled, and in many instances nearly tripled, their total positive thoughts, while the total number of 
negative thoughts stayed the same, thus improving the FT ratio of positives to negatives to 2.1:1.0. In the 
TR condition, there were no changes in the total positive thoughts, but there was a decrease in negative 
thoughts, which improved the TR ratio of positive to negative self-thoughts to 1.2:1.0. At follow-up, both 
groups had a ratio of 1.7:1.0, due to a slight decrement in positive self-statements in the FT group and a 
greater decrement in negative self-statements in the TR group. Interestingly, the ratio of 1.7:1.0 is nearly 
identical to the data from our local, non-distressed sample and corresponds to the ratio of 1.6:1.0 that 
others have suggested represents a psychologically healthy balance (Kendall et al., 1989; Schwartz & 
Garamoni, 1989). Unlike on the STFA, the groups did not differ on the ATQ-N or ATQ-P. Instead, both 
showed post-treatment ratios of 1.6:1.0. Thus, the most consistent finding across the self-statement 
measures was the change in the ratio of positive to negative self-statements. 
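
The ATQ ratios cited above can be approximated from the post-treatment group means in Table 2. This is a rough check that assumes the ratio is formed from group means (ATQ-P divided by ATQ-N); the text does not say whether ratios were instead computed per participant and then averaged, which can give a different figure.

```python
# Post-treatment completer means from Table 2.
post_means = {
    "TR": {"ATQ-P": 87.20, "ATQ-N": 55.30},
    "FT": {"ATQ-P": 84.40, "ATQ-N": 52.30},
}

# Ratio of positive to negative automatic thoughts, from group means (an assumption).
atq_ratios = {
    group: round(scores["ATQ-P"] / scores["ATQ-N"], 1)
    for group, scores in post_means.items()
}
print(atq_ratios)  # {'TR': 1.6, 'FT': 1.6}
```

The same caution applies to the STFA: the tabled ratio is a mean of individual ratios, so dividing the tabled mean totals will not necessarily reproduce it.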

For both groups, the average personal importance and believability ratings tended to increase for 
positives and to decrease for negatives. The general direction of these changes, even though not reaching 
formal statistical significance according to our corrected alpha level (p = .01), speaks to a potential 
criticism of the treatments as being artificial, too structured, or in the case of FT, a rote memorization 
task. If the treatments were artificial, or if FT was simply memorization of generic positive thoughts, 
believability and personal importance would be expected to show no change, or maybe even decrease. 
Clinical behavior analysts have suggested that self-thoughts may have a number of overlapping functions: 
1) as a conditioned elicitor, based on either direct or indirect/verbal pairings with actual aversive events, 
2) as an establishing operation, altering the momentary reinforcing effectiveness and evocative functions 
of other stimuli, and 3) as a verbal self-rule or as the basis for establishing self-rules (Dougher & 
Hackbert, 1994, 2000). Believability and personal importance ratings may be a crude proxy measure for 
these functions (Wilson, Hayes, Gregg, & Zettle, 2001, p. 229). For instance, a self-thought that is low in 
believability and personal importance may be one that is a weak conditioned elicitor, functions as only a 
mild establishing operation, and fails to be a basis for generating self-rules. Conversely, a self-thought 
that is high in believability and personal importance may be a stronger conditioned elicitor that also 
functions as a more significant establishing operation and serves as a basis for generating self-rules. This 
analysis, while plausible, is entirely speculative at the moment. 

Given the short duration of the TR and FT conditions, it is worth noting that brief therapy is 
commonplace. Benton et al. (2003) reported that the mean number of sessions received at their university 
counseling center was six, while a national survey found 73% of campus counseling centers averaged 3-6 
sessions per client (Stone, Vespia, & Kanz, 2000). Moreover, in a large national sample of clients seeking 
psychological services, the median number of sessions attended was less than five (Hansen et al., 2002). 
These data suggest that the development and evaluation of focused, brief intervention strategies is 
important. That said, it is important to note that while the treatment gains achieved were impressive, on 
most of the clinical measures there was room for additional improvement and there were individual 
differences in treatment response. 

There were a number of limitations in the present study. One is the generally small sample size, 
which reduced statistical power for finding between-group differences. That said, the between-group 
effect sizes in the present study were not large or consistent on the clinical measures, indicating that 
extremely large samples would be needed to find group differences that, if found, would not reliably favor 
one condition. A second consideration is the attrition rate. The 33% attrition rate is not atypical in clinical 
trials, but is worth noting given the brevity of the interventions offered. Attrition was not associated with 
increased severity of distress, lower self-esteem, or group assignment. It is possible that attrition was due 
to the brevity of the treatments offered rather than in spite of it. That is, when only one technique is being 
offered, if the rationale for that technique does not readily resonate with the participant, there is less 
incentive to stay in treatment than there would be when offered a multi-component treatment package. 
Another limitation is the absence of data on overt behavior changes and reliance on self-report 
inventories, which may be influenced by demand characteristics, a Hawthorne effect, or repeated testing. 
To provide some protection, we employed commonly used clinical measures that have sound 
psychometric properties and kept the therapist blind to as many of the measures as possible during 
treatment. In addition, we added the STFA, which sampled actual behavior under standardized conditions, 
providing at the very least a manipulation check documenting that the treatments had some unique effects. 
Lastly, this study lacks a concurrent waitlist or supportive therapy control group. Given the comparisons 
between the present results and the extant literature, it seems reasonable to conclude that TR and FT are 
better than no treatment. An important next step is comparison to a supportive therapy group, which 
would control for the effects of non-specific factors. In addition, because of the specific targets of TR 
(challenging negative thoughts) and FT (increasing positive thoughts) another interesting future 
comparison would be with cognitive defusion procedures from Acceptance and Commitment Therapy 
(Hayes, Strosahl, & Wilson, 1999). Cognitive defusion procedures emphasize changing the function of 
thoughts rather than their content or frequency, and recent data with non-distressed college students 
suggested a defusion technique could reduce the discomfort and believability of negative self-thoughts 
(Masuda, Hayes, Sackett, & Twohig, 2004). 

In a relatively severely distressed college sample, three sessions of TR or FT were associated with 
significant and sustained improvements according to commonly used clinical indices. These data support 
the feasibility, acceptability, and potential utility of implementing both strategies clinically and warrant 
consideration in future research, especially research exploring treatment specific effects and attempting to 
identify relationships between specific techniques and mechanisms of change. 